In the world of medical diagnostics, the adoption of various deep learning techniques is very common and effective, and the same holds true for retinal optical coherence tomography (OCT). However, (i) these techniques have a black-box character that keeps medical professionals from fully trusting them; (ii) the lack of precision of these methods limits their implementation in clinical and complex cases; and (iii) existing works and models on OCT classification are substantially large and complex, requiring a significant amount of memory and computational power, which degrades classifier quality in real-time applications. To address these issues, this paper proposes a self-developed CNN model that is comparatively smaller and simpler, and additionally introduces explainable AI into the study through the use of LIME, which helps increase the interpretability of the model. This addition will be an asset to medical experts for obtaining major and detailed insights, will help them make final decisions, and will reduce the opacity and fragility of traditional deep learning models.
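As a rough illustration of how such a LIME explanation could be produced, the minimal sketch below runs the `lime` package's image explainer over a hypothetical Keras OCT classifier; the checkpoint name, input shape, and class set are assumptions for illustration, not the paper's actual pipeline.

```python
# A minimal sketch, assuming a trained Keras CNN ("oct_cnn.h5") that maps
# 224x224 RGB OCT scans to 4 class probabilities (CNV, DME, DRUSEN, NORMAL).
import numpy as np
from tensorflow import keras
from lime import lime_image
from skimage.segmentation import mark_boundaries

model = keras.models.load_model("oct_cnn.h5")   # hypothetical checkpoint
image = np.load("oct_scan.npy")                 # hypothetical (224, 224, 3) array in [0, 1]

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image.astype("double"),
    model.predict,       # classifier_fn: batch of images -> class probabilities
    top_labels=1,
    hide_color=0,
    num_samples=1000,    # number of perturbed samples LIME evaluates
)

# Overlay the superpixels that most support the top predicted class,
# which is the kind of visual evidence a clinician would review.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
highlighted = mark_boundaries(img, mask)
```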
Deep neural networks (DNNs) are vulnerable to a class of attacks called "backdoor attacks", which create an association between a backdoor trigger and a target label the attacker is interested in exploiting. A backdoored DNN performs well on clean test images, yet persistently predicts an attacker-defined label for any sample in the presence of the backdoor trigger. Although backdoor attacks have been extensively studied in the image domain, there are very few works that explore such attacks in the video domain, and they tend to conclude that image backdoor attacks are less effective in the video domain. In this work, we revisit the traditional backdoor threat model and incorporate additional video-related aspects into that model. We show that poisoned-label image backdoor attacks can be extended temporally in two ways, statically and dynamically, leading to highly effective attacks in the video domain. In addition, we explore natural video backdoors to highlight the seriousness of this vulnerability in the video domain. Finally, for the first time, we study multi-modal (audiovisual) backdoor attacks against video action recognition models, where we show that attacking a single modality is enough to achieve a high attack success rate.
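To make the static versus dynamic temporal extension concrete, here is a hedged numpy sketch of how a poisoned training clip might be constructed; the trigger size, placement, and motion path are illustrative assumptions, not the configurations evaluated in the paper.

```python
import numpy as np

def poison_static(clip, trigger, x=0, y=0):
    """Static extension: stamp the same trigger patch at a fixed
    location in every frame of the clip (T, H, W, C)."""
    clip = clip.copy()
    h, w = trigger.shape[:2]
    clip[:, y:y + h, x:x + w] = trigger
    return clip

def poison_dynamic(clip, trigger):
    """Dynamic extension: move the trigger across frames so the
    backdoor pattern varies temporally."""
    clip = clip.copy()
    T, H, W, _ = clip.shape
    h, w = trigger.shape[:2]
    for t in range(T):
        x = (t * 4) % (W - w)          # illustrative horizontal drift
        clip[t, 0:h, x:x + w] = trigger
    return clip

# Usage: poison a fraction of training clips and relabel them with the
# attacker's target class (the poisoned-label setting described above).
clip = np.random.rand(16, 112, 112, 3)   # stand-in video clip
trigger = np.ones((10, 10, 3))           # stand-in white patch
poisoned = poison_dynamic(clip, trigger)
target_label = 0                         # attacker-chosen class
```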
The ability to distinguish between different movie scenes is critical for understanding the storyline of a movie. However, accurately detecting movie scenes is often challenging as it requires the ability to reason over very long movie segments. This is in contrast to most existing video recognition models, which are typically designed for short-range video analysis. This work proposes a State-Space Transformer model that can efficiently capture dependencies in long movie videos for accurate movie scene detection. Our model, dubbed TranS4mer, is built using a novel S4A building block, which combines the strengths of structured state-space sequence (S4) and self-attention (A) layers. Given a sequence of frames divided into movie shots (uninterrupted periods where the camera position does not change), the S4A block first applies self-attention to capture short-range intra-shot dependencies. Afterward, the state-space operation in the S4A block is used to aggregate long-range inter-shot cues. The final TranS4mer model, which can be trained end-to-end, is obtained by stacking the S4A blocks one after the other multiple times. Our proposed TranS4mer outperforms all prior methods in three movie scene detection datasets, including MovieNet, BBC, and OVSD, while also being $2\times$ faster and requiring $3\times$ less GPU memory than standard Transformer models. We will release our code and models.
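The composition of the S4A block can be sketched roughly as below. This is a simplified stand-in: the diagonal state-space layer here is a slow sequential approximation of S4 (which uses a fast convolution kernel), and the shot length and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DiagonalSSM(nn.Module):
    """Simplified diagonal linear state-space layer (a stand-in for S4):
    h_t = a * h_{t-1} + b * x_t,  y_t = sum(c * h_t)."""
    def __init__(self, d_model, d_state=16):
        super().__init__()
        self.log_decay = nn.Parameter(torch.rand(d_model, d_state))
        self.b = nn.Parameter(torch.randn(d_model, d_state) * 0.1)
        self.c = nn.Parameter(torch.randn(d_model, d_state) * 0.1)

    def forward(self, x):                            # x: (B, L, D)
        a = torch.exp(-torch.exp(self.log_decay))    # decay in (0, 1)
        h = x.new_zeros(x.size(0), x.size(2), a.size(1))
        ys = []
        for t in range(x.size(1)):                   # sequential scan; S4 uses a fast kernel
            h = a * h + self.b * x[:, t].unsqueeze(-1)
            ys.append((self.c * h).sum(-1))
        return torch.stack(ys, dim=1)

class S4ABlock(nn.Module):
    """Intra-shot self-attention followed by inter-shot state-space mixing."""
    def __init__(self, d_model, n_heads=4, shot_len=8):
        super().__init__()
        self.shot_len = shot_len
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ssm = DiagonalSSM(d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                  # x: (B, L, D), L divisible by shot_len
        B, L, D = x.shape
        shots = x.reshape(B * (L // self.shot_len), self.shot_len, D)
        attn_out, _ = self.attn(shots, shots, shots)   # short-range, within each shot
        x = self.norm1(x + attn_out.reshape(B, L, D))
        x = self.norm2(x + self.ssm(x))                # long-range, across shots
        return x

# Usage: stack blocks over frame features, e.g. (batch, 64 frames, 256 dims).
model = nn.Sequential(*[S4ABlock(256) for _ in range(4)])
out = model(torch.randn(2, 64, 256))
```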
Network intrusion detection systems (NIDSs) play an important role in computer network security. Among the various detection mechanisms, anomaly-based automated detection significantly outperforms the others. Amid the sophistication and growing number of attacks, dealing with large amounts of data is a recognized issue in the development of anomaly-based NIDS. However, do current models meet the needs of today's networks in terms of required accuracy and dependability? In this research, we propose a new hybrid model that combines machine learning and deep learning to increase detection rates while securing dependability. Our proposed method ensures efficient pre-processing by combining SMOTE for data balancing and XGBoost for feature selection. We compared our developed method to various machine learning and deep learning algorithms to find a more efficient algorithm to implement in the pipeline. Furthermore, we chose the most effective model for network intrusion detection based on a set of benchmarked performance analysis criteria. Our method produces excellent results when tested on two datasets, KDDCUP'99 and CIC-MalMem-2022, with an accuracy of 99.99% and 100%, respectively, and no overfitting or Type-1/Type-2 errors.
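A hedged sketch of the described pre-processing stage, using `imbalanced-learn` for SMOTE and XGBoost feature importances for selection; the synthetic data, the top-20 cutoff, and the downstream classifier choice are illustrative assumptions, not the paper's tuned pipeline.

```python
# Minimal sketch: SMOTE balancing + XGBoost-based feature selection.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Stand-in for an intrusion dataset: imbalanced binary labels.
X, y = make_classification(n_samples=2000, n_features=40,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 1) Balance minority attack classes with synthetic samples.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_train, y_train)

# 2) Rank features with XGBoost and keep the most informative ones.
ranker = XGBClassifier(n_estimators=100).fit(X_bal, y_bal)
keep = np.argsort(ranker.feature_importances_)[::-1][:20]  # illustrative top-20

# 3) Train the final detector on the reduced feature set.
detector = XGBClassifier(n_estimators=300).fit(X_bal[:, keep], y_bal)
print("test accuracy:", detector.score(X_test[:, keep], y_test))
```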
Node classification using graph neural networks (GNNs) has been widely applied in various real-world scenarios. However, in recent years, compelling evidence has emerged that the performance of GNN-based node classification can deteriorate substantially under topological perturbations, such as random connections or adversarial attacks. Various solutions, such as topology denoising methods and mechanism design methods, have been proposed to develop robust GNN-based node classifiers, but none of these works can fully resolve the problems associated with topological perturbations. Recently, a Bayesian label transition model was proposed to tackle this issue, but its slow convergence can lead to inferior performance. In this work, we propose a new label inference model, LInDT, that integrates both Bayesian label transition and topology-based label propagation to improve the robustness of GNNs against topological perturbations. LInDT outperforms existing label transition methods because it improves the label prediction of uncertain nodes by exploiting neighborhood-based label propagation, which leads to better convergence of label inference. In addition, LInDT adopts an asymmetric Dirichlet distribution as its prior, which also helps improve label inference. Extensive experiments on five graph datasets demonstrate the superiority of LInDT for GNN-based node classification under three scenarios of topological perturbations.
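The core idea of combining an asymmetric Dirichlet prior with neighborhood-based label propagation can be loosely sketched as below; this is our simplification for illustration only, not the paper's actual LInDT inference procedure.

```python
import numpy as np

def infer_labels(adj, noisy_onehot, alpha, n_iters=10):
    """Combine an asymmetric Dirichlet prior (alpha, shape (K,)) with
    neighborhood label propagation over adjacency matrix adj (N, N).
    Returns per-node categorical posteriors (N, K)."""
    belief = noisy_onehot.astype(float)
    deg = adj.sum(1, keepdims=True).clip(min=1)
    for _ in range(n_iters):
        # Propagate: each node aggregates its neighbors' current beliefs.
        neighbor_votes = adj @ belief / deg
        # Posterior counts: prior pseudo-counts + observed label + votes.
        counts = alpha + noisy_onehot + neighbor_votes
        belief = counts / counts.sum(1, keepdims=True)
    return belief

# Toy usage: 4 nodes in a chain, 2 classes, one perturbed label.
adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
noisy = np.eye(2)[[0, 0, 1, 0]]     # node 2's observed label is perturbed
alpha = np.array([2.0, 1.0])        # asymmetric Dirichlet prior
posterior = infer_labels(adj, noisy, alpha)
```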
Due to the ever-growing demand for electronic chips across different sectors, semiconductor companies have been mandated to offshore their manufacturing processes. This undesirable situation has raised concerns about the security and trust of their chips and has given rise to hardware attacks. Under these circumstances, different entities in the semiconductor supply chain can act maliciously and execute attacks on the design computing layers, from devices to systems. Our attack of interest is a hardware Trojan inserted during mask generation/fabrication at an untrusted foundry. A Trojan leaves a footprint in the fabricated layout through the addition, deletion, or modification of design cells. To address this problem, we propose the Explainable Vision System for Hardware Testing and Assurance (EVHA) in this work, which can detect the smallest possible changes to a design in a low-cost, accurate, and fast manner. The system's input is scanning electron microscopy (SEM) images acquired from the integrated circuits (ICs) under examination. The system's output is a determination of the IC's status in terms of any defects and/or hardware Trojans introduced through the addition, deletion, or modification of design cells at the cell level. This paper provides an overview of the design, development, implementation, and analysis of our defense system.
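A rough sketch of the cell-level comparison idea: compare a SEM crop of a fabricated cell against a golden reference crop and flag low-similarity cells. The SSIM threshold and the synthetic inputs are assumptions for illustration; EVHA's actual detection pipeline is more involved.

```python
# Minimal sketch: flag suspicious cells by comparing SEM crops against
# golden reference crops via structural similarity (SSIM).
import numpy as np
from skimage.metrics import structural_similarity

def classify_cell(sem_crop, golden_crop, threshold=0.85):
    """Return ('clean' | 'suspect', score) for one cell-level image pair.
    The threshold is an illustrative assumption, not a calibrated value."""
    score = structural_similarity(sem_crop, golden_crop, data_range=1.0)
    return ("clean" if score >= threshold else "suspect"), score

# Toy usage with synthetic crops; real inputs would be aligned grayscale
# crops extracted per design cell from the SEM scan of the IC.
rng = np.random.default_rng(0)
golden = rng.random((64, 64))
tampered = golden.copy()
tampered[20:30, 20:30] = 1.0       # simulated added/modified cell region
print(classify_cell(tampered, golden))
```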
Mastering the technical skills required to perform surgery is an extremely challenging task. Video-based assessment allows surgeons to receive feedback on their technical skills to facilitate learning and development. Currently, this feedback comes primarily from manual video review, which is time-consuming and limits the feasibility of tracking surgeons' progress in many settings. In this work, we introduce a motion-based approach to automatically assess surgical skill from surgical case video feeds. The proposed pipeline first tracks surgical tools reliably to create motion trajectories and then uses those trajectories to predict a surgeon's technical skill level. The tracking algorithm employs a simple yet effective re-identification module that reduces ID switches compared with other state-of-the-art methods. This is critical for creating reliable tool trajectories when instruments regularly move on and off screen or are periodically obscured. The motion-based classification model uses a state-of-the-art self-attention transformer network to capture the short- and long-term motion patterns that are essential to skill assessment. The proposed method was evaluated on the in-vivo (Cholec80) dataset, where an expert-rated objective skill assessment of the Calot's triangle dissection was used as the quantitative skill measure. We compare transformer-based skill assessment with traditional machine learning approaches, using both the proposed and state-of-the-art tracking methods. Our results demonstrate that motion trajectories from reliable tracking methods are beneficial for assessing surgeon skill based purely on video feeds.
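The trajectory-to-skill step might look roughly like the following PyTorch sketch; the feature layout (per-frame tool-center coordinates), the network sizes, and the three-level skill labeling are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class SkillTransformer(nn.Module):
    """Classify a surgeon's skill level from tool motion trajectories:
    input (B, T, F) = per-frame features such as (x, y) centers of K tools."""
    def __init__(self, n_feats=8, d_model=64, n_classes=3, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(n_feats, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, traj):                 # traj: (B, T, n_feats)
        h = self.encoder(self.proj(traj))    # self-attention over time steps
        return self.head(h.mean(dim=1))      # pool over time, predict skill class

# Usage: 4 tools x (x, y) = 8 features per frame, 300-frame clips,
# 3 skill classes (e.g. novice / intermediate / expert -- an assumption).
model = SkillTransformer()
logits = model(torch.randn(2, 300, 8))
```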
Because of the computational cost of running inference for a neural network, the inference step often needs to be deployed on a third party's compute environment or hardware. If the third party is not fully trusted, it is desirable to obfuscate the nature of the inputs and outputs so that the third party cannot easily determine what specific task is being performed. Provably secure protocols for leveraging an untrusted party exist, but they are too computationally demanding to run in practice. Instead, we explore a different strategy of fast, heuristic security that we call Connectionist Symbolic Pseudo Secrets. By leveraging Holographic Reduced Representations (HRRs), we create a neural network with a pseudo-encryption style of defense that empirically shows robustness to attack, even under threat models that unrealistically favor the adversary.
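The HRR machinery underlying this kind of defense can be sketched in a few lines: binding is circular convolution, and approximate unbinding uses the involution inverse. Binding an activation vector with a secret key is our illustration of the general mechanism, not the paper's exact protocol.

```python
import numpy as np

def bind(a, b):
    """HRR binding: circular convolution via FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=a.size)

def inverse(a):
    """Approximate HRR inverse (involution): reverse all but element 0."""
    return np.concatenate(([a[0]], a[:0:-1]))

d = 1024
rng = np.random.default_rng(0)
secret = rng.normal(0, 1.0 / np.sqrt(d), d)       # private key vector
activation = rng.normal(0, 1.0 / np.sqrt(d), d)   # value to obfuscate

hidden = bind(activation, secret)                 # what the untrusted party sees
recovered = bind(hidden, inverse(secret))         # key holder unbinds

# Recovery is approximate but highly correlated with the original.
print(np.corrcoef(activation, recovered)[0, 1])   # close to 1.0
```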
Handwritten digit recognition (HDR) is one of the most challenging tasks in the field of optical character recognition (OCR). Irrespective of language, HDR poses some inherent challenges, mostly due to variations in writing style across individuals, variations in writing media and environment, the inability to maintain the same strokes when writing any digit repeatedly, and so on. In addition, the structural complexity of a particular language's digits can introduce ambiguity into HDR. Over the years, researchers have developed numerous offline and online HDR pipelines in which different image processing techniques are combined with traditional machine learning (ML)-based and/or deep learning (DL)-based architectures. Although the literature contains extensive review studies on HDR for languages such as English, Arabic, Indian, Farsi, and Chinese, there are very few surveys of Bengali HDR (BHDR), and those that exist lack a thorough analysis of the challenges, the underlying recognition process, and possible future directions. In this paper, the characteristics and inherent ambiguities of Bengali handwritten digits are analyzed, along with a comprehensive insight into state-of-the-art datasets and approaches for offline BHDR over the past two decades. Furthermore, several real-life application-specific studies involving BHDR are discussed in detail. This paper will also serve as a compilation for researchers interested in the science behind offline BHDR, instigating the exploration of newer avenues of relevant research that may further lead to better offline recognition of Bengali handwritten digits in different application areas.
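A minimal sketch of the DL stage of a typical offline BHDR pipeline: a small CNN classifying the ten Bengali digit classes. The input size and architecture are illustrative assumptions; real studies in this survey's scope train on corpora such as NumtaDB or CMATERdb.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative CNN for 10-class Bengali digit classification.
model = tf.keras.Sequential([
    layers.Input((32, 32, 1)),               # grayscale, pre-processed digits
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),  # Bengali digits 0-9
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, ...) once digit images and labels are loaded.
```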
Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of <instrument, verb, target> combinations delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and an assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms from competing teams are presented, recognizing surgical action triplets directly from surgical videos and achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison among them and an in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition, which is of utmost importance for the development of AI in surgery.
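For reference, the mAP metric used to rank the challenge methods can be sketched as follows: per-class average precision over triplet presence labels, averaged across triplet classes. The toy arrays and the at-least-one-positive filtering rule are illustrative assumptions.

```python
# Minimal sketch of mean average precision (mAP) over triplet classes,
# as used to score recognition of <instrument, verb, target> triplets.
import numpy as np
from sklearn.metrics import average_precision_score

def triplet_map(y_true, y_score):
    """y_true: (n_frames, n_triplets) binary presence labels;
    y_score: (n_frames, n_triplets) predicted confidences.
    Averages AP over triplet classes that occur at least once."""
    aps = [average_precision_score(y_true[:, k], y_score[:, k])
           for k in range(y_true.shape[1]) if y_true[:, k].any()]
    return float(np.mean(aps))

# Toy usage: 100 frames, 5 triplet classes.
rng = np.random.default_rng(0)
y_true = (rng.random((100, 5)) < 0.2).astype(int)
y_score = np.clip(y_true * 0.6 + rng.random((100, 5)) * 0.5, 0, 1)
print(f"mAP = {triplet_map(y_true, y_score):.3f}")
```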